RL Controller
A Learning-based Control Methodology for Transitioning VTOL UAVs
Lin, Zexin, Zhong, Yebin, Wan, Hanwen, Cheng, Jiu, Sun, Zhenglong, Ji, Xiaoqiang
Transition control poses a critical challenge in the development of Vertical Take-Off and Landing Unmanned Aerial Vehicles (VTOL UAVs) because the tilting-rotor mechanism shifts the center of gravity and the thrust direction during transitions. Existing methods control altitude and position in a decoupled manner, which induces significant vibration and limits both the treatment of coupling effects and adaptability. In this study, we propose a novel coupled transition control methodology driven by a reinforcement learning (RL) based controller. Moreover, in contrast to the conventional phase-transition approach, the proposed ST3M method offers a new perspective by treating cruise mode as a special case of hover. We validate the feasibility of our method in both simulation and real-world environments, demonstrating efficient controller development and migration, accurate control of UAV position and attitude, outstanding trajectory tracking, and reduced vibration during the transition process.
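The "cruise as a special case of hover" idea can be sketched as a single observation/action interface with no discrete mode switch: cruise is just the hover task commanded at a large rotor tilt. The feature layout, clipping ranges, and function names below are our illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def transition_observation(pos_err, vel, att_quat, tilt_angle):
    """Single observation vector shared by hover and cruise: cruise is simply
    hover commanded at a large tilt angle, so no mode switch is needed."""
    return np.concatenate([pos_err, vel, att_quat, [tilt_angle]])

def split_action(action, tilt_rate_limit=0.5):
    """One coupled action vector (per-rotor thrusts plus a tilt rate), instead
    of separate, decoupled altitude and position loops."""
    thrusts = np.clip(action[:-1], 0.0, 1.0)
    tilt_rate = float(np.clip(action[-1], -tilt_rate_limit, tilt_rate_limit))
    return thrusts, tilt_rate
```

Because the tilt angle is just another observation feature and the tilt rate just another action dimension, one policy can be trained across the whole hover-to-cruise envelope.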
Plasma Shape Control via Zero-shot Generative Reinforcement Learning
Wu, Niannian, Li, Rongpeng, Yang, Zongyu, Xiao, Yong, Wei, Ning, Chen, Yihang, Li, Bo, Zhao, Zhifeng, Zhong, Wulyu
Traditional PID controllers have limited adaptability for plasma shape control, and task-specific reinforcement learning (RL) methods suffer from limited generalization and the need for repetitive retraining. To overcome these challenges, this paper proposes a novel framework for developing a versatile, zero-shot control policy from a large-scale offline dataset of historical PID-controlled discharges. Our approach synergistically combines Generative Adversarial Imitation Learning (GAIL) with Hilbert space representation learning to achieve dual objectives: mimicking the stable operational style of the PID data and constructing a geometrically structured latent space for efficient, goal-directed control. The resulting foundation policy can be deployed for diverse trajectory tracking tasks in a zero-shot manner without any task-specific fine-tuning. Evaluations on the HL-3 tokamak simulator demonstrate that the policy excels at precisely and stably tracking reference trajectories for key shape parameters across a range of plasma scenarios. This work presents a viable pathway toward developing highly flexible and data-efficient intelligent control systems for future fusion reactors.
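A minimal sketch of the goal-conditioning step: in a geometrically structured latent space, the zero-shot policy can be conditioned on the unit direction from the current state's embedding to the goal's embedding. The function and symbol names (`phi_s`, `phi_g`) are our assumptions for illustration, not the paper's API.

```python
import numpy as np

def goal_direction(phi_s, phi_g, eps=1e-8):
    """Unit vector from the current embedding phi_s to the goal embedding
    phi_g; a goal-conditioned policy pi(a | s, goal_direction) can then track
    arbitrary reference trajectories without task-specific fine-tuning."""
    d = np.asarray(phi_g, dtype=float) - np.asarray(phi_s, dtype=float)
    n = np.linalg.norm(d)
    return d / n if n > eps else np.zeros_like(d)
```

Normalizing the difference is what makes the representation reusable across goals: only the direction toward the target shape matters, not its absolute position in latent space.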
Safe Reinforcement Learning-Based Vibration Control: Overcoming Training Risks with LQR Guidance
Thorat, Rohan Vitthal, Singh, Juhi, Nayek, Rajdip
Structural vibrations induced by external excitations pose significant risks, including safety hazards for occupants, structural damage, and increased maintenance costs. While conventional model-based control strategies, such as the Linear Quadratic Regulator (LQR), effectively mitigate vibrations, their reliance on accurate system models necessitates tedious system identification. This identification process can be avoided by using a model-free reinforcement learning (RL) method: RL controllers derive their policies solely from observed structural behaviour, eliminating the requirement for an explicit structural model. For an RL controller to be truly model-free, its training must occur on the actual physical system rather than in simulation. During this training phase, however, the RL controller lacks prior knowledge and exerts essentially random control forces on the structure, which can potentially harm it. To mitigate this risk, we propose guiding the RL controller with an LQR controller. While LQR control typically relies on an accurate structural model for optimal performance, our observations indicate that even an LQR controller based on an entirely incorrect model outperforms the uncontrolled scenario. Motivated by this finding, we introduce a hybrid control framework that integrates both LQR and RL controllers. In this approach, the LQR policy is derived from a model with randomly selected parameters; as this policy requires knowledge of neither the true nor an approximate structural model, the overall framework remains model-free. This hybrid approach eliminates dependency on explicit system models while minimizing the exploration risks inherent in naive RL implementations. To the best of our knowledge, this is the first study to address the critical training-safety challenge of RL-based vibration control and to provide a validated solution.
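The hybrid idea can be sketched as follows: design an LQR gain from a deliberately guessed (i.e. wrong) model, then blend its action with the RL policy's action during training. The single-storey model parameters, the blending scheme, and all names below are our assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=2000):
    """Discrete-time LQR gain for u = -K x, via Riccati fixed-point iteration
    (avoids needing scipy's solve_discrete_are)."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Hypothetical single-storey structure discretised at step dt; the mass,
# stiffness and damping values are deliberate guesses, NOT an identified model.
m, k, c, dt = 1.0, 50.0, 2.0, 0.01
A = np.array([[1.0, dt],
              [-k / m * dt, 1.0 - c / m * dt]])
B = np.array([[0.0],
              [dt / m]])
K = lqr_gain(A, B, Q=np.diag([100.0, 1.0]), R=np.array([[0.1]]))

def hybrid_action(x, u_rl, beta):
    """Blend the safe LQR baseline with the RL policy's action.
    beta in [0, 1] is the RL weight, increased as training progresses."""
    u_lqr = -K @ x
    return (1.0 - beta) * u_lqr + beta * u_rl
```

Early in training `beta` is small, so the (suboptimal but stabilizing) LQR action dominates and shields the structure from random exploration; as the RL policy improves, `beta` is annealed toward 1.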
Reinforcement Learning Based Traffic Signal Design to Minimize Queue Lengths
Nandakumar, Anirud, Banerjee, Chayan, Vanajakshi, Lelitha Devi
Efficient traffic signal control (TSC) is crucial for reducing congestion, travel delays, and pollution, and for ensuring road safety. Traditional approaches, such as fixed signal control and actuated control, often struggle to handle dynamic traffic patterns. In this study, we propose a novel adaptive TSC framework that leverages Reinforcement Learning (RL), using the Proximal Policy Optimization (PPO) algorithm, to minimize total queue lengths across all signal phases. The challenge of efficiently representing highly stochastic traffic conditions for an RL controller is addressed through multiple state representations, including an expanded state space, an autoencoder representation, and a K-Planes-inspired representation. The proposed algorithm has been implemented in the Simulation of Urban MObility (SUMO) traffic simulator and demonstrates superior performance over both traditional methods and conventional RL-based approaches in reducing queue lengths. The best-performing configuration achieves an approximately 29% reduction in average queue length compared to the traditional Webster method. Furthermore, a comparative evaluation of alternative reward formulations demonstrates the effectiveness of the proposed queue-based approach, showcasing the potential for scalable and adaptive urban traffic management.
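The queue-based reward and one plausible "expanded" state can be sketched in a few lines. The exact feature set (one-hot phase encoding, elapsed green time) is our assumption; the paper's state representations may differ.

```python
import numpy as np

def queue_reward(queue_lengths):
    """Reward = negative total queue length over all signal phases, so that
    maximising the return minimises accumulated queues."""
    return -float(np.sum(queue_lengths))

def expanded_state(queue_lengths, current_phase, n_phases, elapsed_green):
    """One plausible 'expanded' state: per-phase queue lengths, a one-hot
    encoding of the active phase, and the elapsed green time (illustrative
    feature choice, not the paper's specification)."""
    onehot = np.zeros(n_phases)
    onehot[current_phase] = 1.0
    return np.concatenate([np.asarray(queue_lengths, dtype=float),
                           onehot, [float(elapsed_green)]])
```

With this reward, a PPO agent choosing the next phase (or green duration) is pushed directly toward shorter queues, rather than toward proxies such as cumulative delay.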
EMP: Executable Motion Prior for Humanoid Robot Standing Upper-body Motion Imitation
Xu, Haocheng, Zhang, Haodong, Chen, Zhenghan, Xiong, Rong
To support humanoid robots in performing manipulation tasks, it is essential to study stable standing while accommodating upper-body motions. However, the limited controllable range of a humanoid robot in a standing position affects the stability of the entire body. We therefore introduce a reinforcement learning based framework for humanoid robots to imitate human upper-body motions while maintaining overall stability. Our approach begins with a retargeting network that generates a large-scale upper-body motion dataset for training the reinforcement learning (RL) policy; the policy enables the humanoid robot to track upper-body motion targets, with domain randomization employed for enhanced robustness. To avoid exceeding the robot's execution capability and to ensure safety and stability, we propose an Executable Motion Prior (EMP) module, which adjusts the input target movements based on the robot's current state. This adjustment improves standing stability while minimizing changes to motion amplitude. We evaluate our framework through simulation and real-world tests, demonstrating its practical applicability.
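The core operation of a motion-prior module like EMP can be sketched as projecting the commanded pose toward the current pose so the per-step change stays within an executable bound. Here the bound is a fixed constant for simplicity, whereas EMP adapts the adjustment to the robot's current state; the function name and bound are our assumptions.

```python
import numpy as np

def executable_target(q_target, q_current, max_step=0.1):
    """Clip the commanded upper-body joint pose so its deviation from the
    current pose never exceeds max_step per joint (radians). This keeps the
    tracked target inside the robot's executable range while changing the
    motion amplitude as little as possible."""
    q_target = np.asarray(q_target, dtype=float)
    q_current = np.asarray(q_current, dtype=float)
    delta = np.clip(q_target - q_current, -max_step, max_step)
    return q_current + delta
```

Targets already within the bound pass through unchanged, so feasible motions are not distorted; only overly aggressive commands are softened.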